Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 507596 |
| Missing cells | 169407 |
| Missing cells (%) | 2.1% |
| Duplicate rows | 41270 |
| Duplicate rows (%) | 8.1% |
| Total size in memory | 62.0 MiB |
| Average record size in memory | 128.0 B |
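The percentages and memory figures in this table can be re-derived from the raw counts. The sketch below does so in plain Python. The figure of 8 bytes per stored value is an assumption (typical for 64-bit numeric columns and object pointers in pandas), not something the report states.

```python
# Counts copied from the Dataset statistics table.
n_rows = 507_596
n_cols = 16
missing_cells = 169_407
duplicate_rows = 41_270

# Percentages are taken over all cells (missing) and over all rows (duplicates).
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_rows

# Assumption: each column stores 8 bytes per row (int64, float64 or object pointer).
BYTES_PER_VALUE = 8
record_bytes = n_cols * BYTES_PER_VALUE
total_mib = n_rows * record_bytes / 2**20

print(f"Missing cells:  {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Record size:    {record_bytes} B, total: {total_mib:.1f} MiB")
```

Under that assumption the per-record size and the total memory both agree with the table after rounding.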
Variable types
| NUM | 10 |
|---|---|
| CAT | 6 |
Warnings
| Warning | Type |
|---|---|
| Dataset has 41270 (8.1%) duplicate rows | Duplicates |
| name has a high cardinality: 43417 distinct values | High cardinality |
| host_name has a high cardinality: 5540 distinct values | High cardinality |
| neighbourhood has a high cardinality: 131 distinct values | High cardinality |
| last_review has a high cardinality: 1784 distinct values | High cardinality |
| last_review has 84415 (16.6%) missing values | Missing |
| reviews_per_month has 84415 (16.6%) missing values | Missing |
| price is highly skewed (γ1 = 31.17485553) | Skewed |
| minimum_nights is highly skewed (γ1 = 45.94216724) | Skewed |
| number_of_reviews has 84415 (16.6%) zeros | Zeros |
| availability_365 has 50453 (9.9%) zeros | Zeros |
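The "Skewed" entries report a skewness coefficient γ1. As a rough illustration, the sketch below computes the population moment coefficient of skewness (third central moment divided by the cubed standard deviation). The report's exact estimator may include a small-sample bias correction, so treat this as an approximation of the idea. The price list is hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical nightly prices: mostly modest, with one extreme listing.
prices = [40, 45, 50, 55, 60, 65, 70, 80, 100, 5000]
symmetric = [1, 2, 3, 4, 5]

print(skewness(prices))     # strongly positive: long right tail
print(skewness(symmetric))  # zero for a symmetric sample
```

A single extreme value is enough to push the coefficient well above zero, which is the pattern the price and minimum_nights entries point to.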
Reproduction
| Analysis started | 2021-03-04 15:23:39.135335 |
|---|---|
| Analysis finished | 2021-03-04 15:25:12.662725 |
| Duration | 1 minute and 33.53 seconds |
| Software version | pandas-profiling v2.9.0 |
| Download configuration | config.yaml |
id
Real number (ℝ≥0)
| Distinct | 35458 |
|---|---|
| Distinct (%) | 7.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21675360.65 |
|---|---|
| Minimum | 6499 |
| Maximum | 48142332 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 6499 |
|---|---|
| 5-th percentile | 1983681 |
| Q1 | 13040692 |
| median | 21897264 |
| Q3 | 30636521 |
| 95-th percentile | 40940564 |
| Maximum | 48142332 |
| Range | 48135833 |
| Interquartile range (IQR) | 17595829 |
Descriptive statistics
| Standard deviation | 11805309.23 |
|---|---|
| Coefficient of variation (CV) | 0.5446418824 |
| Kurtosis | -0.8825681139 |
| Mean | 21675360.65 |
| Median Absolute Deviation (MAD) | 8822179 |
| Skewness | -0.03753418671 |
| Sum | 1.100232637e+13 |
| Variance | 1.39365326e+14 |
| Monotonicity | Not monotonic |
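Several of the descriptive statistics follow directly from others in the same tables. The short check below, using only figures copied from the id tables, shows how range, IQR, coefficient of variation and variance relate to the minimum, maximum, quartiles, mean and standard deviation.

```python
# Figures copied from the id tables.
minimum, maximum = 6499, 48142332
q1, q3 = 13040692, 30636521
mean, std = 21675360.65, 11805309.23

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"CV = {cv:.4f}, variance = {variance:.4e}")
```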
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 28242270 | 22 | < 0.1% | |
| 2349587 | 22 | < 0.1% | |
| 19668551 | 22 | < 0.1% | |
| 8235229 | 22 | < 0.1% | |
| 7367013 | 22 | < 0.1% | |
| 20910474 | 22 | < 0.1% | |
| 945673 | 22 | < 0.1% | |
| 17829317 | 22 | < 0.1% | |
| 28068321 | 22 | < 0.1% | |
| 17556920 | 22 | < 0.1% | |
| Other values (35448) | 507376 | > 99.9% |
| Value | Count | Frequency (%) | |
| 6499 | 17 | < 0.1% | |
| 24056 | 4 | < 0.1% | |
| 25659 | 22 | < 0.1% | |
| 26993 | 4 | < 0.1% | |
| 28066 | 12 | < 0.1% |
| Value | Count | Frequency (%) | |
| 48142332 | 1 | < 0.1% | |
| 48139646 | 1 | < 0.1% | |
| 48136017 | 1 | < 0.1% | |
| 48135943 | 1 | < 0.1% | |
| 48132640 | 1 | < 0.1% |
name
Categorical
| Distinct | 43417 |
|---|---|
| Distinct (%) | 8.6% |
| Missing | 391 |
| Missing (%) | 0.1% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| Quinta da Bicuda - Estúdio Bungalow | 198 | < 0.1% | |
| West Coast Surf Hostel | 167 | < 0.1% | |
| Brand New Hostel in center of Lisbon. | 139 | < 0.1% | |
| NLC Rooms & Suites,made by Travelers for Travelers | 126 | < 0.1% | |
| Luzeiros Suites | 121 | < 0.1% | |
| Bairro Alto Apartment | 119 | < 0.1% | |
| Apartment in the heart of Lisbon | 114 | < 0.1% | |
| Captain Cook | 110 | < 0.1% | |
| Wellcome to Moby Dick Lodge | 110 | < 0.1% | |
| One Bedroom Apartment | 106 | < 0.1% | |
| Other values (43407) | 505895 | 99.7% | |
| (Missing) | 391 | 0.1% |
Frequencies of value counts
Unique
| Unique | 4267 |
|---|---|
| Unique (%) | 0.8% |
Histogram of lengths of the category
Length
| Max length | 245 |
|---|---|
| Median length | 33 |
| Mean length | 33.43575599 |
| Min length | 1 |
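The length statistics above summarise how many characters each category value contains. A minimal sketch of that computation follows. Skipping missing values is a choice made in this sketch, and the report may handle them differently. The sample names are taken from the frequency table above.

```python
from statistics import mean, median

def length_stats(values):
    """Return max, median, mean and min string length, ignoring None entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# Three listing names from the report, plus one missing value.
names = ["Luzeiros Suites", "Captain Cook", "West Coast Surf Hostel", None]
stats = length_stats(names)
print(stats)
```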
host_id
Real number (ℝ≥0)
| Distinct | 14888 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 90671110.3 |
|---|---|
| Minimum | 14455 |
| Maximum | 387871064 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 14455 |
|---|---|
| 5-th percentile | 1756107 |
| Q1 | 14629738 |
| median | 55511500 |
| Q3 | 153744885 |
| 95-th percentile | 268532774 |
| Maximum | 387871064 |
| Range | 387856609 |
| Interquartile range (IQR) | 139115147 |
Descriptive statistics
| Standard deviation | 90303453.14 |
|---|---|
| Coefficient of variation (CV) | 0.9959451566 |
| Kurtosis | -0.1735445391 |
| Mean | 90671110.3 |
| Median Absolute Deviation (MAD) | 50146577 |
| Skewness | 0.9307269064 |
| Sum | 4.60242929e+13 |
| Variance | 8.15471365e+15 |
| Monotonicity | Not monotonic |
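The Median Absolute Deviation (MAD) listed above is a robust measure of spread: the median of the absolute distances from the median. The sketch below assumes the unscaled variant, since the report does not say whether a scaling constant is applied. The sample values are hypothetical.

```python
from statistics import median

def median_absolute_deviation(values):
    """Median of absolute deviations from the median (no scaling constant)."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

# Hypothetical sample with one outlier; the MAD barely notices it.
sample = [1, 2, 3, 4, 100]
mad = median_absolute_deviation(sample)
print(mad)
```

Unlike the standard deviation, the MAD is hardly affected by a few extreme values, which makes it useful for long-tailed columns such as host_id.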
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 3953109 | 6581 | 1.3% | |
| 1756107 | 3483 | 0.7% | |
| 104083974 | 3052 | 0.6% | |
| 76223539 | 1609 | 0.3% | |
| 11707167 | 1474 | 0.3% | |
| 22192546 | 1471 | 0.3% | |
| 2372087 | 1428 | 0.3% | |
| 140893245 | 1339 | 0.3% | |
| 688914 | 1248 | 0.2% | |
| 1969293 | 1228 | 0.2% | |
| Other values (14878) | 484683 | 95.5% |
| Value | Count | Frequency (%) | |
| 14455 | 17 | < 0.1% | |
| 17096 | 22 | < 0.1% | |
| 51461 | 17 | < 0.1% | |
| 58150 | 37 | < 0.1% | |
| 60717 | 3 | < 0.1% |
| Value | Count | Frequency (%) | |
| 387871064 | 1 | < 0.1% | |
| 387506721 | 1 | < 0.1% | |
| 387380530 | 1 | < 0.1% | |
| 387006368 | 1 | < 0.1% | |
| 386842063 | 1 | < 0.1% |
host_name
Categorical
| Distinct | 5540 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 186 |
| Missing (%) | < 0.1% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| Maria | 12230 | 2.4% | |
| Ana | 10390 | 2.0% | |
| Pedro | 9936 | 2.0% | |
| João | 6951 | 1.4% | |
| Feels Like Home | 6581 | 1.3% | |
| Luis | 6394 | 1.3% | |
| Joana | 5371 | 1.1% | |
| Ricardo | 5265 | 1.0% | |
| Nuno | 4966 | 1.0% | |
| Miguel | 4812 | 0.9% | |
| Other values (5530) | 434514 | 85.6% |
Frequencies of value counts
Unique
| Unique | 195 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 35 |
|---|---|
| Median length | 6 |
| Mean length | 8.34922458 |
| Min length | 1 |
neighbourhood_group
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| Lisboa | 367617 | 72.4% | |
| Cascais | 45810 | 9.0% | |
| Sintra | 29520 | 5.8% | |
| Mafra | 28475 | 5.6% | |
| Lourinh | 8241 | 1.6% | |
| Oeiras | 8183 | 1.6% | |
| Torres Vedras | 6023 | 1.2% | |
| Loures | 3678 | 0.7% | |
| Amadora | 2943 | 0.6% | |
| Odivelas | 2270 | 0.4% | |
| Other values (6) | 4836 | 1.0% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 21 |
|---|---|
| Median length | 6 |
| Mean length | 6.200535465 |
| Min length | 5 |
neighbourhood
Categorical
| Distinct | 131 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| Santa Maria Maior | 78987 | 15.6% | |
| Misericrdia | 62767 | 12.4% | |
| Arroios | 47179 | 9.3% | |
| Cascais e Estoril | 31925 | 6.3% | |
| Santo Antnio | 30283 | 6.0% | |
| So Vicente | 29896 | 5.9% | |
| Estrela | 22276 | 4.4% | |
| Ericeira | 17120 | 3.4% | |
| Avenidas Novas | 15951 | 3.1% | |
| S.Maria, S.Miguel, S.Martinho, S.Pedro Penaferrim | 12599 | 2.5% | |
| Other values (121) | 158613 | 31.2% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 49 |
|---|---|
| Median length | 11 |
| Mean length | 13.6944775 |
| Min length | 3 |
latitude
Real number (ℝ≥0)
| Distinct | 15168 |
|---|---|
| Distinct (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.75615571 |
|---|---|
| Minimum | 38.67509 |
| Maximum | 39.31047 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 38.67509 |
|---|---|
| 5-th percentile | 38.69913 |
| Q1 | 38.711 |
| median | 38.71759 |
| Q3 | 38.7387925 |
| 95-th percentile | 38.98144 |
| Maximum | 39.31047 |
| Range | 0.63538 |
| Interquartile range (IQR) | 0.0277925 |
Descriptive statistics
| Standard deviation | 0.1049346094 |
|---|---|
| Coefficient of variation (CV) | 0.002707559805 |
| Kurtosis | 9.904906526 |
| Mean | 38.75615571 |
| Median Absolute Deviation (MAD) | 0.0093 |
| Skewness | 3.122251071 |
| Sum | 19672469.62 |
| Variance | 0.01101127225 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 38.71188 | 546 | 0.1% | |
| 38.71212 | 459 | 0.1% | |
| 38.71201 | 407 | 0.1% | |
| 38.7113 | 406 | 0.1% | |
| 38.71164 | 397 | 0.1% | |
| 38.71236 | 385 | 0.1% | |
| 38.71258 | 383 | 0.1% | |
| 38.71272 | 383 | 0.1% | |
| 38.71223 | 381 | 0.1% | |
| 38.71108 | 377 | 0.1% | |
| Other values (15158) | 503472 | 99.2% |
| Value | Count | Frequency (%) | |
| 38.67509 | 13 | < 0.1% | |
| 38.67552 | 9 | < 0.1% | |
| 38.67563 | 2 | < 0.1% | |
| 38.67577 | 9 | < 0.1% | |
| 38.67605 | 5 | < 0.1% |
| Value | Count | Frequency (%) | |
| 39.31047 | 1 | < 0.1% | |
| 39.30995 | 1 | < 0.1% | |
| 39.30461 | 1 | < 0.1% | |
| 39.30367 | 11 | < 0.1% | |
| 39.30196 | 5 | < 0.1% |
longitude
Real number (ℝ)
| Distinct | 18227 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -9.203225179 |
|---|---|
| Minimum | -9.49852 |
| Maximum | -8.81186 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | -9.49852 |
|---|---|
| 5-th percentile | -9.42647 |
| Q1 | -9.22565 |
| median | -9.14683 |
| Q3 | -9.13461 |
| 95-th percentile | -9.12249 |
| Maximum | -8.81186 |
| Range | 0.68666 |
| Interquartile range (IQR) | 0.09104 |
Descriptive statistics
| Standard deviation | 0.1108528815 |
|---|---|
| Coefficient of variation (CV) | -0.01204500371 |
| Kurtosis | -0.06387595829 |
| Mean | -9.203225179 |
| Median Absolute Deviation (MAD) | 0.015835 |
| Skewness | -1.213098336 |
| Sum | -4671520.288 |
| Variance | 0.01228836133 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| -9.14464 | 295 | 0.1% | |
| -9.14431 | 287 | 0.1% | |
| -9.14446 | 285 | 0.1% | |
| -9.14404 | 279 | 0.1% | |
| -9.1342 | 266 | 0.1% | |
| -9.13574 | 258 | 0.1% | |
| -9.14455 | 256 | 0.1% | |
| -9.13504 | 250 | < 0.1% | |
| -9.1337 | 247 | < 0.1% | |
| -9.13508 | 244 | < 0.1% | |
| Other values (18217) | 504929 | 99.5% |
| Value | Count | Frequency (%) | |
| -9.49852 | 22 | < 0.1% | |
| -9.48813 | 4 | < 0.1% | |
| -9.48808 | 5 | < 0.1% | |
| -9.48789 | 5 | < 0.1% | |
| -9.48648 | 13 | < 0.1% |
| Value | Count | Frequency (%) | |
| -8.81186 | 1 | < 0.1% | |
| -8.83793 | 1 | < 0.1% | |
| -8.83827 | 8 | < 0.1% | |
| -8.84009 | 5 | < 0.1% | |
| -8.84606 | 11 | < 0.1% |
room_type
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| Entire home/apt | 373761 | 73.6% | |
| Private room | 116640 | 23.0% | |
| Shared room | 8666 | 1.7% | |
| Hotel room | 8529 | 1.7% |
Frequencies of value counts
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Histogram of lengths of the category
Length
| Max length | 15 |
|---|---|
| Median length | 15 |
| Mean length | 14.15832867 |
| Min length | 10 |
price
Real number (ℝ≥0)
| Distinct | 895 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 99.36494377 |
|---|---|
| Minimum | 0 |
| Maximum | 20970 |
| Zeros | 832 |
| Zeros (%) | 0.2% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 22 |
| Q1 | 45 |
| median | 65 |
| Q3 | 100 |
| 95-th percentile | 239 |
| Maximum | 20970 |
| Range | 20970 |
| Interquartile range (IQR) | 55 |
Descriptive statistics
| Standard deviation | 247.2691823 |
|---|---|
| Coefficient of variation (CV) | 2.488495167 |
| Kurtosis | 1433.487107 |
| Mean | 99.36494377 |
| Median Absolute Deviation (MAD) | 25 |
| Skewness | 31.17485553 |
| Sum | 50437248 |
| Variance | 61142.04852 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 50 | 24811 | 4.9% | |
| 60 | 23029 | 4.5% | |
| 100 | 19108 | 3.8% | |
| 80 | 18307 | 3.6% | |
| 70 | 17575 | 3.5% | |
| 40 | 16744 | 3.3% | |
| 45 | 15475 | 3.0% | |
| 65 | 15200 | 3.0% | |
| 55 | 14638 | 2.9% | |
| 75 | 13353 | 2.6% | |
| Other values (885) | 329356 | 64.9% |
| Value | Count | Frequency (%) | |
| 0 | 832 | 0.2% | |
| 7 | 2 | < 0.1% | |
| 8 | 26 | < 0.1% | |
| 9 | 563 | 0.1% | |
| 10 | 824 | 0.2% |
| Value | Count | Frequency (%) | |
| 20970 | 1 | < 0.1% | |
| 20764 | 1 | < 0.1% | |
| 20604 | 1 | < 0.1% | |
| 20282 | 1 | < 0.1% | |
| 20199 | 1 | < 0.1% |
minimum_nights
Real number (ℝ≥0)
| Distinct | 80 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.089043255 |
|---|---|
| Minimum | 1 |
| Maximum | 1000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 5 |
| Maximum | 1000 |
| Range | 999 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 12.58249195 |
|---|---|
| Coefficient of variation (CV) | 4.073265057 |
| Kurtosis | 3051.348739 |
| Mean | 3.089043255 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 45.94216724 |
| Sum | 1567986 |
| Variance | 158.3191037 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 2 | 189376 | 37.3% | |
| 1 | 138521 | 27.3% | |
| 3 | 116281 | 22.9% | |
| 4 | 24436 | 4.8% | |
| 5 | 16576 | 3.3% | |
| 7 | 8583 | 1.7% | |
| 6 | 3333 | 0.7% | |
| 30 | 3006 | 0.6% | |
| 15 | 1808 | 0.4% | |
| 28 | 880 | 0.2% | |
| Other values (70) | 4796 | 0.9% |
| Value | Count | Frequency (%) | |
| 1 | 138521 | 27.3% | |
| 2 | 189376 | 37.3% | |
| 3 | 116281 | 22.9% | |
| 4 | 24436 | 4.8% | |
| 5 | 16576 | 3.3% |
| Value | Count | Frequency (%) | |
| 1000 | 34 | < 0.1% | |
| 800 | 2 | < 0.1% | |
| 730 | 6 | < 0.1% | |
| 400 | 3 | < 0.1% | |
| 365 | 82 | < 0.1% |
number_of_reviews
Real number (ℝ≥0)
| Distinct | 642 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.35798351 |
|---|---|
| Minimum | 0 |
| Maximum | 981 |
| Zeros | 84415 |
| Zeros (%) | 16.6% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 13 |
| Q3 | 51 |
| 95-th percentile | 175 |
| Maximum | 981 |
| Range | 981 |
| Interquartile range (IQR) | 49 |
Descriptive statistics
| Standard deviation | 63.68356248 |
|---|---|
| Coefficient of variation (CV) | 1.577966908 |
| Kurtosis | 10.94764889 |
| Mean | 40.35798351 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 2.786294066 |
| Sum | 20485551 |
| Variance | 4055.59613 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 84415 | 16.6% | |
| 1 | 34291 | 6.8% | |
| 2 | 23580 | 4.6% | |
| 3 | 18710 | 3.7% | |
| 4 | 15027 | 3.0% | |
| 5 | 12940 | 2.5% | |
| 6 | 11039 | 2.2% | |
| 7 | 10866 | 2.1% | |
| 8 | 9233 | 1.8% | |
| 9 | 8388 | 1.7% | |
| Other values (632) | 279107 | 55.0% |
| Value | Count | Frequency (%) | |
| 0 | 84415 | 16.6% | |
| 1 | 34291 | 6.8% | |
| 2 | 23580 | 4.6% | |
| 3 | 18710 | 3.7% | |
| 4 | 15027 | 3.0% |
| Value | Count | Frequency (%) | |
| 981 | 1 | < 0.1% | |
| 941 | 1 | < 0.1% | |
| 877 | 3 | < 0.1% | |
| 876 | 1 | < 0.1% | |
| 872 | 1 | < 0.1% |
last_review
Categorical
| Distinct | 1784 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 84415 |
| Missing (%) | 16.6% |
| Memory size | 3.9 MiB |
| Value | Count | Frequency (%) | |
| 2020-01-02 | 3665 | 0.7% | |
| 2020-03-15 | 3367 | 0.7% | |
| 2020-01-01 | 2991 | 0.6% | |
| 2020-03-16 | 2911 | 0.6% | |
| 2020-01-03 | 2688 | 0.5% | |
| 2020-03-14 | 2393 | 0.5% | |
| 2020-03-08 | 2297 | 0.5% | |
| 2020-03-13 | 2229 | 0.4% | |
| 2020-03-09 | 2218 | 0.4% | |
| 2019-12-09 | 2112 | 0.4% | |
| Other values (1774) | 396310 | 78.1% | |
| (Missing) | 84415 | 16.6% |
Frequencies of value counts
Unique
| Unique | 28 |
|---|---|
| Unique (%) | < 0.1% |
Histogram of lengths of the category
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 8.835875381 |
| Min length | 3 |
reviews_per_month
Real number (ℝ≥0)
| Distinct | 1247 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 84415 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.525255813 |
|---|---|
| Minimum | 0.01 |
| Maximum | 59.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 0.07 |
| Q1 | 0.35 |
| median | 1.01 |
| Q3 | 2.3 |
| 95-th percentile | 4.51 |
| Maximum | 59.3 |
| Range | 59.29 |
| Interquartile range (IQR) | 1.95 |
Descriptive statistics
| Standard deviation | 1.518659671 |
|---|---|
| Coefficient of variation (CV) | 0.9956753865 |
| Kurtosis | 18.92782289 |
| Mean | 1.525255813 |
| Median Absolute Deviation (MAD) | 0.8 |
| Skewness | 2.041651464 |
| Sum | 645459.28 |
| Variance | 2.306327195 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 1 | 4720 | 0.9% | |
| 0.06 | 4330 | 0.9% | |
| 0.09 | 4294 | 0.8% | |
| 0.1 | 4222 | 0.8% | |
| 0.07 | 4210 | 0.8% | |
| 0.05 | 4098 | 0.8% | |
| 0.11 | 3956 | 0.8% | |
| 0.08 | 3825 | 0.8% | |
| 0.13 | 3790 | 0.7% | |
| 0.12 | 3769 | 0.7% | |
| Other values (1237) | 381967 | 75.3% | |
| (Missing) | 84415 | 16.6% |
| Value | Count | Frequency (%) | |
| 0.01 | 278 | 0.1% | |
| 0.02 | 2191 | 0.4% | |
| 0.03 | 3419 | 0.7% | |
| 0.04 | 3453 | 0.7% | |
| 0.05 | 4098 | 0.8% |
| Value | Count | Frequency (%) | |
| 59.3 | 1 | < 0.1% | |
| 52.04 | 1 | < 0.1% | |
| 51.61 | 1 | < 0.1% | |
| 44.75 | 1 | < 0.1% | |
| 43.64 | 1 | < 0.1% |
calculated_host_listings_count
Real number (ℝ≥0)
| Distinct | 146 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.42133114 |
|---|---|
| Minimum | 1 |
| Maximum | 340 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 3 |
| Q3 | 10 |
| 95-th percentile | 62 |
| Maximum | 340 |
| Range | 339 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 39.70468949 |
|---|---|
| Coefficient of variation (CV) | 2.753191721 |
| Kurtosis | 37.4038927 |
| Mean | 14.42133114 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 5.786760295 |
| Sum | 7320210 |
| Variance | 1576.462368 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 1 | 154645 | 30.5% | |
| 2 | 61146 | 12.0% | |
| 3 | 41319 | 8.1% | |
| 4 | 33520 | 6.6% | |
| 5 | 24595 | 4.8% | |
| 6 | 21384 | 4.2% | |
| 7 | 15428 | 3.0% | |
| 8 | 13776 | 2.7% | |
| 9 | 10242 | 2.0% | |
| 10 | 9990 | 2.0% | |
| Other values (136) | 121551 | 23.9% |
| Value | Count | Frequency (%) | |
| 1 | 154645 | 30.5% | |
| 2 | 61146 | 12.0% | |
| 3 | 41319 | 8.1% | |
| 4 | 33520 | 6.6% | |
| 5 | 24595 | 4.8% |
| Value | Count | Frequency (%) | |
| 340 | 340 | 0.1% | |
| 337 | 337 | 0.1% | |
| 336 | 672 | 0.1% | |
| 330 | 330 | 0.1% | |
| 328 | 328 | 0.1% |
availability_365
Real number (ℝ≥0)
| Distinct | 366 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 224.6077609 |
|---|---|
| Minimum | 0 |
| Maximum | 365 |
| Zeros | 50453 |
| Zeros (%) | 9.9% |
| Memory size | 3.9 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 115 |
| median | 267 |
| Q3 | 343 |
| 95-th percentile | 365 |
| Maximum | 365 |
| Range | 365 |
| Interquartile range (IQR) | 228 |
Descriptive statistics
| Standard deviation | 128.8132767 |
|---|---|
| Coefficient of variation (CV) | 0.5735032314 |
| Kurtosis | -1.152750022 |
| Mean | 224.6077609 |
| Median Absolute Deviation (MAD) | 91 |
| Skewness | -0.5494561166 |
| Sum | 114010001 |
| Variance | 16592.86025 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 0 | 50453 | 9.9% | |
| 365 | 34352 | 6.8% | |
| 364 | 18585 | 3.7% | |
| 363 | 7787 | 1.5% | |
| 180 | 5611 | 1.1% | |
| 179 | 5562 | 1.1% | |
| 1 | 5485 | 1.1% | |
| 362 | 5339 | 1.1% | |
| 358 | 5168 | 1.0% | |
| 361 | 4256 | 0.8% | |
| Other values (356) | 364998 | 71.9% |
| Value | Count | Frequency (%) | |
| 0 | 50453 | 9.9% | |
| 1 | 5485 | 1.1% | |
| 2 | 1776 | 0.3% | |
| 3 | 869 | 0.2% | |
| 4 | 623 | 0.1% |
| Value | Count | Frequency (%) | |
| 365 | 34352 | 6.8% | |
| 364 | 18585 | 3.7% | |
| 363 | 7787 | 1.5% | |
| 362 | 5339 | 1.1% | |
| 361 | 4256 | 0.8% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
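The calculation described above can be sketched in a few lines of plain Python. The data are hypothetical, and the second call illustrates the invariance under changes of location and (positive) scale.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

# Hypothetical, nearly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Shifting and rescaling each variable separately leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], [0.5 * v - 7 for v in y])
print(r, r_shifted)
```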
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
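As a sketch of that definition, the code below ranks each variable (ties receive the average of their positions, a common convention assumed here) and then applies the Pearson formula to the ranks. The data are hypothetical: y is a cubic function of x, so the relationship is monotonic but not linear.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic, but not linear

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Here ρ reaches its maximum because the ordering of y matches that of x exactly, while r stays below 1 because the relationship is curved.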
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
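The pair-counting definition above corresponds to the variant usually called tau-a, sketched below on hypothetical data. Library implementations commonly use tau-b, which adjusts the denominator for ties, so results can differ when tied values are present. Which variant this report uses is not stated here.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```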
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6499 | Belém 1 Bedroom Historical Apartment | 14455 | Bruno | Lisboa | Belm | 38.69750 | -9.19768 | Entire home/apt | 79 | 3 | 26 | 2020-01-03 | 0.36 | 1 | 242 |
| 1 | 25659 | Heart of Alfama - Coeur d'Alfama - Lisbon Center | 107347 | Ellie | Lisboa | Santa Maria Maior | 38.71167 | -9.12696 | Entire home/apt | 45 | 3 | 113 | 2019-12-08 | 1.46 | 1 | 365 |
| 2 | 29248 | Apartamento Alfama com vista para o rio! | 125768 | Bárbara | Lisboa | Santa Maria Maior | 38.71272 | -9.12628 | Entire home/apt | 43 | 1 | 322 | 2020-06-14 | 2.74 | 1 | 329 |
| 3 | 29396 | Alfama Hill - Boutique apartment | 126415 | Mónica | Lisboa | Santa Maria Maior | 38.71239 | -9.12887 | Entire home/apt | 44 | 2 | 247 | 2020-08-23 | 2.45 | 2 | 331 |
| 4 | 29915 | Modern and Cool Apartment in Lisboa | 128890 | Sara | Lisboa | Avenidas Novas | 38.74712 | -9.15286 | Entire home/apt | 48 | 5 | 37 | 2020-01-21 | 0.30 | 1 | 293 |
| 5 | 33348 | Happy Season | 144484 | Bruno | Lisboa | Lumiar | 38.76381 | -9.15256 | Private room | 40 | 1 | 2 | 2011-07-22 | 0.02 | 2 | 0 |
| 6 | 40817 | Chiado, Alecrim walk to Riverfront | 176410 | S. | Lisboa | Misericrdia | 38.70898 | -9.14312 | Entire home/apt | 60 | 1 | 360 | 2020-01-27 | 3.06 | 15 | 357 |
| 7 | 42519 | Nice Apart.BAIRRO ALTO (ADAMASTOR) 6-1º | 136230 | David | Lisboa | Misericrdia | 38.71082 | -9.15090 | Entire home/apt | 70 | 1 | 114 | 2020-03-08 | 1.05 | 10 | 89 |
| 8 | 48025 | Apartment for renting in Lisbon | 218778 | José | Lisboa | Misericrdia | 38.71309 | -9.14392 | Entire home/apt | 65 | 5 | 17 | 2019-12-03 | 0.15 | 5 | 353 |
| 9 | 48058 | Small House Downtown Cascais | 218990 | Pim | Cascais | Cascais e Estoril | 38.69650 | -9.42571 | Entire home/apt | 86 | 5 | 33 | 2020-07-23 | 0.50 | 1 | 326 |
Last rows
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 507586 | 45488632 | Olissipo studio 2 by Innkeeper | 203773123 | João | Lisboa | Misericrdia | 38.710740 | -9.151370 | Entire home/apt | 35 | 1 | 0 | NaN | NaN | 16 | 356 |
| 507587 | 45495106 | Duplex terrasse vue sur le Castelo Sao Jorge | 13026506 | Pauline | Lisboa | Arroios | 38.718640 | -9.138130 | Entire home/apt | 240 | 2 | 0 | NaN | NaN | 1 | 25 |
| 507588 | 45500332 | Camões - Private bedroom in Chiado with AC | 155354572 | Jose | Lisboa | Misericrdia | 38.711750 | -9.144540 | Private room | 27 | 7 | 0 | NaN | NaN | 3 | 362 |
| 507589 | 45501381 | Camões - Large private bedroom in Chiado | 155354572 | Jose | Lisboa | Misericrdia | 38.709970 | -9.144250 | Private room | 33 | 30 | 0 | NaN | NaN | 3 | 270 |
| 507590 | 45504333 | Stylish Twin Room Balcony & PrivateWc Alameda IST | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.736710 | -9.132160 | Private room | 60 | 29 | 0 | NaN | NaN | 6 | 0 |
| 507591 | 45505941 | Stylish & Quiet Room with PrivateWC- Alameda IST | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.736281 | -9.131548 | Private room | 45 | 30 | 0 | NaN | NaN | 6 | 0 |
| 507592 | 45506121 | Stylish & Spacious Room w PrivateWC- Alameda IST | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.735840 | -9.133190 | Private room | 45 | 30 | 0 | NaN | NaN | 6 | 0 |
| 507593 | 45506284 | Queen Bed w Balcony & PrivateWc - Alameda (IST) | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.735527 | -9.132067 | Private room | 50 | 30 | 0 | NaN | NaN | 6 | 0 |
| 507594 | 45506608 | Quiet & Comfortable room w PrivateWC - Alameda IST | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.735527 | -9.132067 | Private room | 45 | 29 | 0 | NaN | NaN | 6 | 262 |
| 507595 | 45506858 | Stylish UpperSuite Balcony PrivateWC- Alameda IST | 23266121 | Joana & Francisco | Lisboa | Arroios | 38.736630 | -9.132290 | Private room | 69 | 29 | 0 | NaN | NaN | 6 | 264 |
Most frequent
| | id | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | price | minimum_nights | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 214 | 514936 | DAHOUSE LUXURY VILLA / CASCAIS | 643762 | Dahouse | Cascais | Alcabideche | 38.73282 | -9.39581 | Private room | 133 | 6 | 1 | 2013-08-01 | 0.01 | 2 | 364 | 20 |
| 1282 | 3255610 | Double Room in Eugaria Country House | 16464609 | Eugaria | Sintra | Colares | 38.79671 | -9.43184 | Private room | 105 | 2 | 2 | 2015-09-07 | 0.03 | 3 | 364 | 18 |
| 65 | 158255 | Charming Apartment in Lisbon | 743285 | Antonio | Lisboa | Estrela | 38.71009 | -9.15388 | Entire home/apt | 350 | 31 | 2 | 2015-12-28 | 0.03 | 1 | 363 | 15 |
| 277 | 643954 | Bela Vista à Graça + Parking | 3221454 | Ana Isabel | Lisboa | So Vicente | 38.71939 | -9.12974 | Entire home/apt | 120 | 1 | 2 | 2017-09-02 | 0.03 | 1 | 0 | 15 |
| 402 | 921088 | The nest of Telheiro de São Vicente | 4950865 | Cândida E José | Lisboa | So Vicente | 38.71493 | -9.12847 | Entire home/apt | 65 | 1 | 1 | 2013-05-07 | 0.01 | 1 | 365 | 15 |
| 700 | 1736839 | Historic Lisbon overlooking Tejo | 9143632 | João Miguel | Lisboa | Estrela | 38.70551 | -9.15761 | Private room | 20 | 2 | 1 | 2015-07-18 | 0.02 | 1 | 358 | 15 |
| 1782 | 5257983 | Sunny & spacious room in Alfama | 27215192 | Max | Lisboa | Santa Maria Maior | 38.70862 | -9.13076 | Private room | 35 | 1 | 1 | 2015-02-28 | 0.02 | 1 | 0 | 15 |
| 2042 | 6225662 | Room with a bathroom, kitchen | 32286650 | Roman | Lisboa | Arroios | 38.73093 | -9.13573 | Private room | 17 | 1 | 1 | 2015-05-21 | 0.02 | 1 | 0 | 15 |
| 2084 | 6316747 | Victoria House Flat I (10 pax!!) | 32865630 | Laços | Lisboa | Avenidas Novas | 38.73372 | -9.14615 | Entire home/apt | 130 | 7 | 1 | 2015-08-13 | 0.02 | 2 | 0 | 15 |
| 2170 | 6484834 | Cosy apartment | 1626484 | Carolina | Lisboa | Alcntara | 38.70612 | -9.18458 | Entire home/apt | 80 | 1 | 1 | 2015-08-23 | 0.02 | 1 | 0 | 15 |
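The table above lists the rows that occur most often as exact duplicates, together with their counts. A minimal sketch of that kind of tally in plain Python follows. The rows are hypothetical, shortened tuples (id, host_name, price) rather than full records, and `most_frequent_duplicates` is an illustrative helper, not part of the profiling tool.

```python
from collections import Counter

def most_frequent_duplicates(rows, top=10):
    """Return (row, count) for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    duplicated = [(row, c) for row, c in counts.most_common() if c > 1]
    return duplicated[:top]

# Hypothetical shortened rows: (id, host_name, price).
rows = [
    (514936, "Dahouse", 133),
    (514936, "Dahouse", 133),
    (3255610, "Eugaria", 105),
    (514936, "Dahouse", 133),
    (158255, "Antonio", 350),
    (3255610, "Eugaria", 105),
]

result = most_frequent_duplicates(rows)
for row, count in result:
    print(row, count)
```

Rows must be hashable for this to work, which is why tuples are used. With a data frame the same idea applies to whole rows compared across all columns.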